County Armagh
Improving Factual Error Correction for Abstractive Summarization via Data Distillation and Conditional-generation Cloze
Li, Yiyang, Li, Lei, Hu, Dingxin, Hao, Xueyi, Litvak, Marina, Vanetik, Natalia, Zhou, Yanquan
Improving factual consistency in abstractive summarization has been a focus of current research. One promising approach is the post-editing method. However, previous works have yet to make sufficient use of the factual factors in summaries and suffer from the negative effects of their training datasets. In this paper, we first propose a novel factual error correction model, FactCloze, based on a conditional-generation cloze task. FactCloze can construct the causality among factual factors while also determining whether a blank can be answered or not. Then, we propose a data distillation method to generate a more faithful summarization dataset, SummDSC, via multiple-dimensional evaluation. We experimentally validate the effectiveness of our approach, which leads to an improvement in multiple factual consistency metrics compared to baselines.
- Europe > Spain > Balearic Islands (0.04)
- Europe > United Kingdom > Scotland > Na h-Eileanan Siar (0.04)
- Europe > United Kingdom > Northern Ireland > County Armagh (0.04)
Data-Efficient Finetuning Using Cross-Task Nearest Neighbors
Ivison, Hamish, Smith, Noah A., Hajishirzi, Hannaneh, Dasigi, Pradeep
Obtaining labeled data to train a model for a task of interest is often expensive. Prior work shows that training models on multitask data augmented with task descriptions (prompts) effectively transfers knowledge to new tasks. Towards efficiently building task-specific models, we assume access to a small number (32-1000) of unlabeled target-task examples and use those to retrieve the most similar labeled examples from a large pool of multitask data augmented with prompts. Compared to the current practice of finetuning models on uniformly sampled prompted multitask data (e.g., FLAN, T0), our approach of finetuning on cross-task nearest neighbors is significantly more data-efficient. Using only 2% of the data from the P3 pool without any labeled target-task data, our models outperform strong baselines trained on all available data by 3-30% on 12 out of 14 datasets representing held-out tasks, including legal and scientific document QA. Similarly, models trained on cross-task nearest neighbors from SuperNaturalInstructions, representing about 5% of the pool, obtain comparable performance to state-of-the-art models on 12 held-out tasks from that pool. Moreover, the models produced by our approach also provide a better initialization than single multitask finetuned models for few-shot finetuning on target-task data, as shown by a 2-23% relative improvement over few-shot finetuned T0-3B models on 8 datasets.
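The retrieval step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for whatever encoder the paper uses, and the names `retrieve_neighbors`, `pool`, and `targets` are hypothetical. The sketch only shows the general idea of taking the union of the k nearest labeled pool examples for each unlabeled target example.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (an assumption; stands in for a real encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_neighbors(target_inputs, pool, k):
    """Return sorted indices of pool examples that are among the k nearest
    neighbors of at least one unlabeled target input."""
    pool_vecs = [embed(ex["input"]) for ex in pool]
    selected = set()
    for query in target_inputs:
        q = embed(query)
        ranked = sorted(
            range(len(pool)),
            key=lambda i: cosine(q, pool_vecs[i]),
            reverse=True,
        )
        selected.update(ranked[:k])
    return sorted(selected)


# Hypothetical labeled multitask pool (prompted inputs with outputs).
pool = [
    {"input": "question: who signed the contract? context: the contract was signed by the tenant",
     "output": "the tenant"},
    {"input": "translate to french: good morning", "output": "bonjour"},
    {"input": "question: what does the statute require? context: the statute requires written notice",
     "output": "written notice"},
    {"input": "sentiment: the movie was wonderful", "output": "positive"},
]

# Hypothetical unlabeled target-task example (no output needed for retrieval).
targets = ["question: what does the lease require? context: the lease requires a deposit"]

print(retrieve_neighbors(targets, pool, k=2))  # → [0, 2]
```

In this toy run the two QA-style pool examples are selected, and a model would then be finetuned only on that retrieved subset rather than on the whole pool.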
- North America > United States > New York > New York County > New York City (0.14)
- Europe > Italy (0.14)
- North America > United States > Ohio (0.04)
- Research Report > New Finding (0.46)
- Research Report > Promising Solution (0.34)
- Leisure & Entertainment > Sports > Football (1.00)
- Law (1.00)
- Government (1.00)
- Media > Television (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.90)
Government minister to demand Tinder and Grindr explain what they're doing to protect children
The culture secretary Jeremy Wright is to question Tinder and Grindr about the measures they use to protect children, after police records showed children are at risk of grooming and sexual exploitation on the dating apps. The Secretary of State for Digital, Culture, Media and Sport (DCMS) said he was "truly shocked" to discover that perpetrators of child sex offences had used online dating services. Mr Wright said: "I will be writing to these companies asking what measures they have in place to keep children safe from harm, including verifying their age. If I'm not satisfied with their response, I reserve the right to take further action." Police have investigated more than 30 incidents of child rape since 2015 where victims were sexually exploited after evading age checks on dating apps, according to The Sunday Times.
- Atlantic Ocean > North Atlantic Ocean > English Channel (0.05)
- Europe > United Kingdom > England > Tyne and Wear (0.05)
- Europe > France (0.05)